system_prompt = """
You are a cross-disciplinary scientist with expertise in atmospheric science and artificial intelligence.
You are highly familiar with state-of-the-art research on arXiv.
"""

system_prompt_json = system_prompt + """
Always respond strictly in valid JSON format, with no extra text or explanations.
"""

system_prompt_md = system_prompt + """
Always respond strictly in valid Markdown format with clear and readable structure.
Do not include the Markdown code block wrapper.
"""

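# A minimal sketch, not part of the original prompts: models sometimes wrap a reply in a
# code fence even when told not to. The helper below (a hypothetical name) removes one
# enclosing fence, if present, before the reply is parsed or saved. It assumes the fence
# spans the whole reply; replies without such a fence are returned unchanged.
import re

_FENCE_RE = re.compile(r"^```[\w-]*[ \t]*\n(.*?)\n?```[ \t]*$", re.DOTALL)


def strip_code_fence(reply: str) -> str:
    """Return `reply` without a single enclosing ``` fence, if it has one."""
    text = reply.strip()
    match = _FENCE_RE.match(text)
    return match.group(1).strip() if match else text
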
user_prompt_filter = """
You will receive a list of papers from the Atmospheric and Oceanic Physics category on arXiv.
Your task is to methodically select papers from the list that meet both of the following criteria:
1. Atmospheric science relevance: The paper's primary focus must involve atmospheric science topics
   (for example: weather forecasting, climate prediction, data assimilation, meteorological datasets,
   numerical atmospheric modeling, or related areas). Exclude papers whose main focus is exclusively ocean science.
2. Deep learning involvement: The paper must directly involve deep learning—for example, it builds, applies,
   evaluates, or discusses deep learning models or methods. Exclude papers that use only traditional numerical modeling,
   classical (non-deep) machine learning (such as SVMs, decision trees), or purely statistical techniques.
As you review each paper, apply both criteria strictly and select only papers that clearly fulfill both.
Base your decisions solely on the provided title and summary.
Your output must be a JSON object containing an array named "articles",
where each element is a string with the paper's "id" (HTTP link).
Respond with this JSON only and no other text.
Here is the paper list in JSON format. Each entry includes "id" (HTTP link), "title", "summary", and "pdf_link":
{articles_json}
"""

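# A minimal sketch of how the filter prompt might be filled in and its reply checked.
# The function names are hypothetical, and the article keys ("id", "title", "summary",
# "pdf_link") are taken from the prompt text above. The template is passed in explicitly,
# e.g. build_filter_prompt(user_prompt_filter, articles).
import json


def build_filter_prompt(template: str, articles: list[dict]) -> str:
    """Substitute the article list into the `{articles_json}` placeholder."""
    # str.replace is used rather than str.format so that braces elsewhere in the
    # template are never interpreted as format fields.
    articles_json = json.dumps(articles, ensure_ascii=False, indent=2)
    return template.replace("{articles_json}", articles_json)


def parse_filter_response(reply: str, articles: list[dict]) -> list[str]:
    """Return the selected ids, keeping only ids that were present in the input."""
    data = json.loads(reply)
    selected = data.get("articles") if isinstance(data, dict) else None
    if not isinstance(selected, list):
        raise ValueError('expected a JSON object with an "articles" array')
    known_ids = {article["id"] for article in articles}
    # Dropping unknown ids guards against links the model may have invented.
    return [item for item in selected if isinstance(item, str) and item in known_ids]
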
user_prompt_read = """
# Task
Summarize the attached paper according to the following structure and guidelines.

# Summary Structure
1. Plain Language Summary: In 2-3 sentences, concisely explain the paper's key achievements in simple,
   accessible language for a non-expert.
2. Data: Introduce the datasets used. Specify their source (e.g., ERA5, CMIP6) and their role in the study.
   For modeling papers, clearly define the input features and prediction targets.
   - Example: For a paper like GraphCast, state that the inputs are two previous time steps of ERA5 data and
     the target is the next time step, detailing the specific variables and temporal/spatial resolution.
3. Methods: Describe the core methodology, with a focus on the AI architecture. Be specific and avoid generalizations.
   Explain how the method works, not just that it was used.
   - Example: Instead of "NowcastNet uses deep learning with physical principles," explain,
     "NowcastNet decomposes precipitation into intensity and motion components, predicting each separately to
     forecast future precipitation."
4. Results: Summarize the top-level findings and the authors' main conclusions. Avoid a detailed, figure-by-figure
   description unless a specific result is exceptionally novel or surprising.
5. Q&A: Use 1-2 Q&A pairs to highlight the main techniques of the paper. Imagine the readers' questions
   as they read your summary.
   - Example: In end-to-end modeling, readers might wonder how the model handles scattered observational data;
     in S2S forecasting, they might be interested in how the model accounts for external forcings like SST.
6. Recommendation Score: Rate the paper on a scale of 1 (Limited Value) to 5 (Must Read) based on
   your expert assessment. Situate the paper within the current research landscape, particularly considering trends
   since 2023. Compare and contrast its approach and findings with well-known papers if possible.
   Avoid vague praise. Do not hesitate to assign a lower score. Most papers should fall between 2 and 4.

# Guiding Principles
1. Clarity for Decision-Making: The summary must empower a reader to quickly grasp the paper's essence and
   decide if it warrants a full read. Be concise and prioritize high-impact information.
2. Reproducibility Mindset: Highlight the main points in the Data and Methods sections so that a knowledgeable reader
   could mentally outline the experimental setup.
3. Target Audience: Assume the reader has an undergraduate-level understanding of both atmospheric science and AI.
   Define or simplify highly specialized terms beyond this scope (e.g., use "diffusion model" instead of "DDPM";
   use "satellite" or "polar-orbiting satellite" instead of "METOP-A";
   explain "LoRA" as "an efficient fine-tuning technique").
4. Completeness Check: Review your summary from the perspective of a scientist unfamiliar with the paper.
   Ensure there are no major logical gaps or unexplained concepts.
5. PDF Extraction Artifacts: The text is extracted from a PDF of the paper. Watch for extraction artifacts, such as
   page numbers mistaken for content, and exclude them from your analysis.

Please respond with a well-structured Markdown document.
"""

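# Guiding principle 5 above asks the model to ignore PDF-extraction debris. As an optional
# complement, a pre-processing step could drop lines that consist only of a page number
# before the text is attached. This is a heuristic sketch with a hypothetical name, not
# part of the original pipeline; it may also remove a legitimate line that contains only
# a number (for example, an isolated table cell).
import re

_PAGE_NUMBER_RE = re.compile(
    r"^\s*(?:page\s+)?\d{1,4}(?:\s*(?:/|of)\s*\d{1,4})?\s*$", re.IGNORECASE
)


def drop_page_number_lines(text: str) -> str:
    """Remove lines such as '12', 'Page 3', or '3 / 10' from extracted PDF text."""
    kept = [line for line in text.splitlines() if not _PAGE_NUMBER_RE.match(line)]
    return "\n".join(kept)
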
user_prompt_summary = """
Below is information about {num_articles} articles on AI applications in atmospheric science.
Each article is prefaced with its metadata (its title, published time, authors, and link) followed by its summary.

Please synthesize these summaries into a single, well-structured Markdown document with the following sections:
---
# Executive Summary
In a concise paragraph, synthesize the overarching trends from these papers. Focus on answering:
1. What key problems are being addressed (e.g., forecasting, emulation, downscaling)?
2. What methods or architectures are currently dominant or emerging (e.g., Transformers, GNNs, Diffusion)?
3. What common datasets or baselines are being used to drive progress?

# Highlights
Identify the 3-5 most compelling papers from the list, if possible.
These should be papers you consider "must-reads" due to high novelty, significant results, or high potential impact.
For each highlighted paper:
1. List the paper title.
2. Provide a 1-2 sentence justification explaining why it stands out.
   - Example:
     - This paper introduces a novel architecture that significantly outperforms the ECMWF-IFS.
     - It's the first to successfully apply diffusion models to S2S prediction.

# Future Directions
Based on the collective insights and limitations of these papers, suggest 2-3 promising research avenues.
Think synthetically:
1. In what other fields could the methods presented in these papers be applied?
2. What limitations of these papers should be addressed?
3. Based on current progress, what are the next logical challenges to pursue?

# Article List
Create a Markdown table that catalogs all the provided papers. This table should serve as a quick reference guide.
The table columns should be:
- Title. Please strictly use the title information provided in the metadata.
- Keywords. Generate 3-5 keywords for the paper, separated by commas.
- Score. Convert the Recommendation Score (1 to 5) given in the paper's summary into star symbols (e.g., ★★★☆☆).
- Link. Include the link from the metadata (e.g., https://arxiv.org/pdf/2311.07222),
  ensuring strict adherence to the URL in the metadata.
---

Here is the list of articles:
"""

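# A minimal sketch of how the summary prompt might be assembled: the `{num_articles}`
# placeholder is filled in and each article is appended as metadata followed by its
# summary, as the prompt describes. The function names and the dictionary keys ("title",
# "published", "authors", "link", "summary") are assumptions made for illustration.
# Usage: build_summary_prompt(user_prompt_summary, articles).
def build_summary_prompt(template: str, articles: list[dict]) -> str:
    """Fill in the article count and append one metadata-plus-summary block per article."""
    header = template.replace("{num_articles}", str(len(articles)))
    blocks = []
    for index, article in enumerate(articles, start=1):
        blocks.append(
            f"## Article {index}\n"
            f"Title: {article['title']}\n"
            f"Published: {article['published']}\n"
            f"Authors: {', '.join(article['authors'])}\n"
            f"Link: {article['link']}\n\n"
            f"{article['summary'].strip()}"
        )
    return header.rstrip() + "\n\n" + "\n\n".join(blocks) + "\n"


def score_to_stars(score: int) -> str:
    """Render a 1-5 score in the star format requested for the Article List table."""
    if not 1 <= score <= 5:
        raise ValueError("score must be between 1 and 5")
    return "★" * score + "☆" * (5 - score)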